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Genome wide association studies (GWAS) have been utilized to identify genetic risk loci associated 
with both simple and complex inherited disorders. Here, we performed a GWAS in Labrador retrievers 
to identify genetic loci associated with hip dysplasia and body weight. Hip dysplasia scores were 
available for 209 genotyped dogs. We identified a significantly associated locus for hip dysplasia on 
chromosome 24, with three equally associated SNPs (p= 4.3 x 10~’) in complete linkage disequilibrium 
located within NDRG3, a gene which in humans has been shown to be differentially expressed in 
osteoarthritic joint cartilage. Body weight, available for 85 female dogs, was used as phenotype for 

a second analysis. We identified two significantly associated loci on chromosome 10 (p=4.5 x 107”) 
and chromosome 31 (p=2.5 x 10). The most associated SNPs within these loci were located within 
the introns of the PRKCE and CADM2 genes, respectively. PRKCE has been shown to play a role in 
regulation of adipogenesis whilst CADM2 has been associated with body weight in multiple human 
GWAS. In summary, we identified credible candidate loci explaining part of the genetic inheritance 
for hip dysplasia and body weight in Labrador retrievers with strong candidate genes in each locus 
previously implicated in the phenotypes investigated. 


Canine genome wide association studies (GWAS) have been proven to be a powerful tool to identify genetic 
loci associated with both simple and complex inherited conditions in dogs'°. Hip dysplasia (HD) occurs com- 
monly in dogs and has a major impact on animal welfare and life expectancy for the affected dogs*. The disease is 
considered heritable with an estimated heritability of between 0.2 and 0.6 and a presumed polygenic etiology*’. 
However, environmental factors such as excessive food intake and rapid weight gain in growing animals as 
well as certain types of physical activities can increase the risk of developing clinical disease**. Treatment for 
clinical disease includes pain management, physiotherapy, and surgical orthopedic procedures!*"'!. Due to the 
heritability of the disease, screening dogs within high-risk breeds, such as German shepherds and Labrador 
retrievers, prior to breeding has been a major focus from kennel clubs and dog breeders as this has been shown 
to reduce frequency of moderate and severe HD’. Screening for HD is done by radiographic examination of 
the hip joint with the legs extended and abducted whilst the animal is heavily sedated or anaesthetized to allow 
for relaxation of the joint. Radiographic images are subsequently graded based on a standardized radiographic 
scoring scheme. The procedure for radiographic examination and subsequent grading has been standardized 
within the Fédération Cynologique Internationale (FCI)'*. It has been shown that there is a good correlation 
between the radiographic evaluation of hip conformation and the development of clinically detectable HD later 
in life. Hence, selected breeding based on HD screening can be expected to reduce the prevalence of HD over 
time*?°. However, due to the complexity and presumed polygenic inheritance of HD as well as environmental 
factors influencing the phenotypic development, radiographic screening of breeding stock is not a warranty for 
unaffected offspring. Hence, other predictive measures such as genetic testing combined with estimated breeding 
values (EBVs) would provide more robust tools for breeding strategies against HD’. 
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Multiple studies have investigated the genetic cause of HD in dogs by GWAS varying in the number of SNPs 
used, the breed of interest and the phenotypic HD classification’**°. Although there is overlap between risk loci 
identified between some studies, and some studies investigate previously identified risk alleles in independent 
populations, no studies so far have validated that genetic markers can be used as a complement or as an alterna- 
tive to radiographic screening. Several published genetic studies have focused on Labrador retrievers, as this is a 
popular breed which has been shown to have a higher risk of developing HD than some other breeds*®!>!81971)?, 
However, a genetic test based on a previous GWAS study in Labrador retrievers failed to correlate with the HD 
score when tested in an independent Labrador retriever cohort!>”’. 

Body weight or body size are highly heritable but are complex traits as evidenced by studies both in humans 
and animals**’®. The Labrador retriever is a dog breed which is known to be food motivated and at risk of devel- 
oping obesity’’”*. Previously, a deletion affecting the POMC gene was found to be associated with body weight 
and food motivation in Labrador retrievers”. As obesity is rarely inherited as a monogenic trait, it is likely that 
additional genetic loci play a role for these traits in Labradors*°. 

The aim of this study was to perform GWAS analyses on Labrador retrievers to identify genetic risk loci 
for HD and body weight based on Swedish Kennel Club (SKK) registered phenotypic data. In addition, color 
phenotypes were included as examples of simple Mendelian inherited phenotypes, to show proof of the method 
used for correction of population structure and data inflation in this cohort. 


Results 

GWAS analysis using color as the dependent phenotype 

Genotyping data and color phenotypes were available for 209 SKK registered Labrador retrievers consisting of 205 
females and 4 males (supplementary file $1). As a proof of concept, we performed a GWAS analysis using color 
as the dependent variable. This was performed to confirm that known genetic loci could be identified using this 
dataset and to show that the linear mixed model was able to correct for population structure and data inflation. 
We evaluated the population structure within the genotyped dogs by drawing a multidimensional scaling (MDS) 
plot depicting the two first dimensions (C1 and C2) and reflecting the color phenotype for each individual dog 
(Supplementary Fig. $1). The MDS plot showed that there is some population structure within the genotyped 
dogs with an unequal distribution of the brown phenotype across the dimensions. 

We performed two separate GWAS analyses using color as the dependent phenotype. The first analysis com- 
pared black versus yellow dogs. This analysis included 148 black and 38 yellow Labrador retrievers with 115,072 
SNPs passing data filtering. A basic association test correctly identified the MCR1 locus on chromosome 5 as 
the most associated locus (chr5: 62,870,500, Ppiack/yellow= 1.4 10°”), however, the results were inflated with a 
lambda value of A=1.39, as seen on the QQ-plot (Supplementary Fig. S2a and b). A subsequent analysis, using a 
linear mixed model, corrected for the data inflation with a resulting lambda value of 1 =0.96 and also correctly 
identified the MCR1 locus (chr5:64,358,997, Ppiack/yellow= 1-49 X 10-5), (Supplementary Fig. $2c and d). 

The second analysis was performed comparing 148 black against 22 brown Labrador retrievers. A basic 
association test correctly identified the TYRPI1 locus as the most associated locus (chr11:33,578,561, 
Pblack/brown = 1.8 x 1078). However, the results were inflated with a lambda value of  =3.06, as visualized on the 
QQ-plot (Supplementary Fig. S3a,b). Applying a linear mixed model resulted in a mild overcorrection of the 
inflation with a lambda value of = 0.90 with the most associated locus (chr11:33,578,561, Pplack/brown = 7-0 X 107*°) 
still being identified as the TYRP1 locus (Supplementary Fig. S3c,d). 

The “chip heritability”, ie. the percentage of the phenotypic variation, that could be explained by the 
genetic variation in the genotyping data (PVE), was calculated for each trait using GEMMA”". This resulted in 
PVE black/yellow = 0-96 +/— 0.16(SE) and the PVE pjackbrown = 0.93 +/— 0.12(SE), which is in alignment with expecta- 
tions for the inheritance of these color traits*”. 

Having showed that the linear mixed model was able to correct for genomic inflation, whilst correctly iden- 
tifying the color loci, we applied this analytical method to analyze our data with HD grading and body weight 
as the dependent variables. 


GWAS analysis using the numerical HD score as the dependent variable 

Genotyping data and matched numerically translated HD scores, as described in the materials and methods, 
were available for 209 SKK registered Labrador retrievers representing 205 females and 4 males. All individu- 
als and 114,854 SNPs passed the quality control and filtering. An MDS plot showing the two first dimensions 
(C1 and C2), reflecting the numerical translated score of the included dogs, is shown in Fig. 1a together with 
a summary of the scores (Fig. 1b). As previously observed, the MDS plot shows some evidence of population 
structure that could affect the analytical process. A basic GWAS association test was performed in PLINK using 
the numerically translated HD value as the dependent continuous variable**. The analysis identified the strong- 
est association to a locus on chromosome 24, with some degree of data inflation reflected in the QQ-plot and 
the lambda value 4 =1.31 (Fig. 2a,b). A linear mixed model was applied correcting for the inflation, yielding a 
lambda value of i =1.05 (Fig. 2c). A single significant locus was identified on chromosome 24, represented by five 
significantly associated SNPs, reaching the LD-corrected Bonferroni threshold (Fig. 2d, supplementary Table 1). 
The three strongest associated SNPs (chr24:25,406,999, chr24:25,417,878, chr24:25,465,028) were equally associ- 
ated (pyp =4.3 x 10°”), in complete LD (R?= 1.0), and are located within the NDRG3 gene (Fig. 2e,f). One of these 
SNPs at chr24:25,465,028 is positioned in the splice donor position -3 of exon 4 in NDRG3. When evaluating the 
LD structure surrounding the most associated SNPs we found a 0.7 MB locus (chr24:24,945,587-25,656,879) 
in high LD (R?>0.8) with the most associated SNPs. This region encodes multiple candidate genes (EPB41L1, 
AAR2, DLGAP4, MYL9, TGIF2, ENSCAFG00000029848, NDRG3, DSN1, SOGA1, C20orf118 and SAMHD1) and 
the locus has previously been associated with HD in a GWAS performed in German shepherds and in a targeted 
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Figure 1. (a) Multidimensional scaling plot showing the genetic distance between the 209 genotyped Labrador 
retrievers in a two-dimensional space. Each datapoint represents an individual dog and the shape and color of 
the datapoint reflects the hip dysplasia (HD) score for the individual. Filled circle Represents individuals with 
grade A HD score (Normal hip score A). Grey circle represents individuals with grade B HD score (Normal 

hip score B). Filled triangle Represents individuals with grade C HD score (Mild hip dysplasia). Open circle 
Represents individuals with grade D HD score (Moderate hip dysplasia). Open triangle p represents individuals 
with grade E HD score (Severe hip dysplasia). (b) Summary of the actual and numerically translated HD score 
and the number of individuals in each category of the study population. 


study validating selected HD associated SNPs in Labrador retrievers*”**. A list of nominally significant loci was 
generated based on deviation from the expected p-value (supplementary file S2). 

The percentage of phenotypic variability explained by the dataset, as calculated by GEMMA, was 
PVEyp = 0.36 +/— 0.22(SE), which is in alignment with the previously reported heritability for HD in Labrador 
retrievers’. Body weight at the time of HD scoring was available for a subset of individuals n =89. We examined 
if there was any difference in the body weight between the HD score groups however, we did not find any sig- 
nificant difference between groups p=0.26 (Supplementary Fig. S4). 


Evaluation of EBV, in relation to the associated genotype on chromosome 24 

Estimated breeding value is an indirect prediction of the genetic merit of a trait and can be used to predict the 
likelihood of an individual passing on a genetic trait to its offspring. The EBV for HD (EBVy,p) was available for 
205 of the 209 genotyped dogs. To validate the association between the identified risk genotype and HD status, we 
investigated if there was a difference in the EBVy;p between different genotype groups (Fig. 4a). The EBVy;p was 
compared between dogs that did not carry the risk allele and dogs that were either heterozygous or homozygous 
for the risk allele. As data was not normally distributed, as determined by a normality testing, Kruskal-Wallis 
and Dunn’s multiple comparison test was used to compare the EBVyp between groups. We found that there 
was a significant difference between the mean EBVyp in the groups (p= 0.003) (Table 1). When performing 
intergroup comparisons, we found that dogs that were homozygous for the risk allele had significantly lower 
(i.e. worse) EBVyp than dogs that did not carry the risk allele (p = 0.0023) or dogs that were heterozygous for 
the risk allele (p = 0.023) (Fig. 4a). 


GWAS analysis using body weight as the dependent phenotype 

Body weight measured at the time of HD scoring was available for 89 dogs, 85 females and 4 males. As body 
weight differs significantly between males and females, we excluded the 4 males from the dataset to avoid an 
imbalanced sex bias”*. To evaluate the effect of age on body weight, we performed a simple linear regression 
test comparing age and body weight. We did not find any correlation between increasing age and body weight 
(R?=0.0003 and p=0.87 in the female dogs) (supplementary Fig. $5). We evaluated the population structure 
within the genotyped dogs by drawing an MDS plot depicting the two first dimensions (C1 and C2) and depict- 
ing dogs by weight groups. As previously seen the population structure is still observed in this smaller dataset 
(Supplementary Fig. S6). It is also noted that the weight categories are not equally distributed across the plot. 
GWAS analysis was performed using body weight as a continuous variable. After filtering and quality control, 
data from 85 dogs and 113,822 SNPs were included in the analysis. A basic association test was used to identify 
a locus on chromosome 10 as the most associated locus. However, the results were highly inflated with a lambda 
value of A =3.18 (Fig. 3a,b). A second analysis using a linear mixed model, markedly reduced the inflation and 
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Figure 2. GWAS analysis based on hip dysplasia scores in 209 Labrador retrievers. (a) QQ-plot showing the 
expected versus the observed p-value from the basic association test. (b) Manhattan plot showing the genetic 
position across the 39 chromosomes versus the —log10 transformed p-value from the basic association test. (c) 
QQ-plot showing the expected versus the observed p-value from the linear mixed model association test. (d) 
Manhattan plot showing the genetic position across the 39 chromosomes versus the —log10 transformed p-value 
from the linear mixed model association test. The red line delineates the Bonferroni significance threshold. The 
blue line delineates a nominal significance threshold based on the deviation of the QQ-plot from the expected 
p-value. (e) LD plot reflecting the pairwise R? between SNPs in a 2,000 kb window centered around the most 
associated SNP from association analysis. (f) Closer view of the associated locus on chromosome 24 reflecting 
the LD structure in relation to the most associated SNP (g) Annotated genome view of the area covering the 
three most associated SNPs on chromosome 24 which are in complete LD and are positioned within the coding 
and non-coding part of the NDRG3 gene. 
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Figure 3. GWAS analysis based on the body weight of 85 female Labrador retrievers. (a) QQ-plot showing the 
expected versus the observed p-value from the basic association test. (b) Manhattan plot showing the genetic 
position across the 39 chromosomes versus the —log10 transformed p-value from the basic association test. (c) 
QQ-plot showing the expected versus the observed p-value from the linear mixed model association test. (d) 
Manhattan plot showing the genetic position across the 39 chromosomes versus the —log10 transformed p-value 
from the linear mixed model association test. The red line delineates the Bonferroni significance threshold. 

(e) Closer view of the associated locus on chromosome 10 reflecting the LD structure in relation to the most 
associated SNP (f) Closer view of the associated locus on chromosome 31 reflecting the LD structure in relation 
to the most associated SNP (g) Annotated genome view of the locus on chromosome 10 delimited by the most 
associated SNP and the two SNPs in highest LD with the associated SNP. (h) Annotated genome view of the area 
covered by the four most associated SNPs on chromosome 31 which are equally associated and in complete LD. 
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EBV HD 


Genotype chr24:25406999 | A/A | A/G | G/G_ | P-value 
Mean EBVyp 99.1 | 97.7 | 86.0 0.003 
Number of individuals 147 | 48 10 

Genotype chr10:48367568 cic | C/T | T/T P-value 
Mean weight in Kg 21.8 | 27.8 |29.5 |3.18x10°? 
Number of individuals 14 27 44 

Genotype chr31:1938609 cic | C/T | T/T P-value 
Mean weight in Kg 26.6 | 31.6 | 33.5* | 2.45x10°% 
Number of individuals 67 17 1 


Table 1. Summary of the statistics from the data analysis shown in Fig. 4 a-c). P-values for the Kruskal-Wallis 
test (EBV}p) and the one-way ANOVA (body weight) are shown. 
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Figure 4. Box plots showing differences in phenotypic variables dependent on genotype. (a) Box plot 
showing the difference in mean estimated breeding value for HD (EBVyp) between dogs homozygous non 
risk (A/A) heterozygous risk (A/G) and homozygous risk (G/G) for the SNP chr24:25,406,999 associated 
with hip dysplasia. Significant difference is observed within the dataset. Group-wise comparison shows that 
homozygous risk individuals have a significantly lower (i.e. worse) EBVyp than other dogs. (b) Box plot 
showing the difference in mean body weight between dogs homozygous non risk (C/C) heterozygous risk (C/T) 
and homozygous risk (T/T) for the SNP chr10:48,367,568 associated with increased body weight. Significant 
difference is observed within the dataset. With group wise comparison homozygous non risk individuals 
have a significantly lower body weight than the other groups. (c) Box plot showing the difference in mean 
body weight between dogs being homozygous non risk (C/C) heterozygous risk (C/T) and homozygous risk 
(T/T) for the SNP chr31:1,938,609 associated with increased body weight. Significant difference is observed 
within the dataset. With group wise comparison homozygous non risk individuals have a lower body weight 
than heterozygous risk carriers. Not significant (ns) = p-value > 0.05, * = p-value < 0.05, ** = p-value < 0.01, 
= p-value < 0.001, **** = p-value < 0.0001. 


resulted in a lambda value of 4 =1.04. Two loci, on chr10:48,367,568, ppw=4.5 x 10°” and on chr31:1,938,609, 
Paw =2.5 x 10°, were identified with SNPs reaching the LD corrected Bonferroni threshold (Fig. 3c,d). The most 
associated SNP on chromosome 10 is located within an intron of the PRKCE gene, a gene previously linked to 
adipogenesis and body weight**”*. The locus on chromosome 10 is a narrow locus (Fig. 3e). The SNPs in high- 
est LD with the locus defining SNP (R?>0.6) defines a 220 kb region containing the PRKCE and EPAS1 genes 
(Fig. 3g). The locus on chromosome 31 included four equally associated SNPs in complete LD (R?= 1.0) cover- 
ing a 180 kb region spanning the CADM2 gene (Fig. 3f,h), which is a gene that has been linked to obesity in 
multiple human studies looking at body composition in both children and adults*”~’. In the NHGRI-EBI GWAS 
catalogue, BMI is listed as the most common trait associated with CADM2*. Evaluation of the LD structure 
surrounding the SNP marking the locus, showed a 450 kb locus in high LD (R?> 0.8) with CADM2 being the 
only gene located in the region. 

The percentage of phenotypic variability explained by the dataset in relation to body weight, as calculated by 
GEMMA, was PVEwprcur= 0.94 + /- 0.31(SE). 
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Relationship between body weight and genotype for the two associated loci 

To evaluate the effect of genotype on weight we performed a one-way ANOVA analysis evaluating the difference 
between body weight in relation to the genotype for the two associated loci. We found that there was a signifi- 
cant difference in mean weight between the genotype groups representing the two risk loci on chromosome 10 
Pawio= 3.2 x 10°” and chromosome 31 pgw3) =2.4 x 10°° (Table 1). Further, there was a significant difference 
in weight between dogs not carrying the risk allele on chromosome 10 and dogs being either heterozygous 
Pecict = 4.4 x 10-8 or homozygous pecrr = 6.7 x 107! for the risk allele. A significant difference between the 
weight of dogs not carrying the risk allele and dogs being heterozygous for the risk allele on chromosome 31 
Pccicr= 3-5 x 10°, was also found. However, only one dog was homozygous for the risk allele on chromosome 
31 and hence no further meaningful statistical testing could be performed for homozygosity of this risk allele. 
Finally, a combined analysis was performed looking at the difference of body weight in dogs not carrying any 
risk alleles on chromosome 10 and chromosome 31 and dogs carrying different combinations of risk alleles for 
the two loci on chromosome 10 and chromosome 31 (Table 2 and Fig. 5). A statistically significant difference in 
body weight between dogs not carrying any risk alleles and dogs carrying any combination of risk alleles for the 
two associated loci on chromosome 10 and chromosome 31 was found. 


Genotype 

Chr 31/Chr10 N Mean weight P-value Dunnets multiple comparison to non-risk 
C/CC/C 14 21.8 

C/C C/T 21 27.2 1.38 x 10-7 

C/C T/T 32 28.4 3.18 x 107"! 

C/T C/T 6 30.0 2.38x 10° 

C/T T/T ll 32.5 2.40 x 10714 

T/T T/T 1 33,5 1.54x 10-4 * 


Table 2. Summary of the comparison between individuals not carrying any risk alleles for increased body 
weight and individuals carrying different combinations of risk alleles representing the two associated loci 

on chromosome 10 and 31. Risk allele shown in bold. P-values represent statistical comparison between the 
control group not carrying any risk alleles and groups carrying different combinations of risk alleles. *Only one 
sample was homozygous for both risk alleles and hence this should be taken into account when interpreting 
the data. 
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Figure 5. Boxplot illustrating the difference in weight between different groups. Labels indicate the genotype 
for the dogs in each group as reflected in Table 2 with T being the risk allele for each locus. 
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Discussion 

In this study a GWAS investigating the genetic association of HD and body weight in Labrador retrievers based 
on SKK registered phenotypic data, was performed. This identified a locus on chromosome 24 associated with 
HD scores in Labrador retrievers. Subsequently we evaluated if the EBV syp calculated by the SKK were different 
between groups either not carrying the risk allele or being hetero- or homozygotic for the risk allele. The EBVyp 
was found to be lower (i.e. worse) in individuals carrying the risk allele, with the most significant effect observed 
in the group of individuals homozygous for the risk allele compared to the heterozygotes and non-carriers. The 
identified risk locus contains three equally associated SNPs in complete LD located within the NDGR3 gene. 
NDGR3 is a gene which has been shown to be strongly expressed in osteoarthritic joint cartilage in humans and 
is proposed to play a role in the hypoxia mediated development and progression of osteoarthritis*!”. A locus 
adjacent to the NDRG3 locus on chromosome 24 has previously also been shown to be associated with HD in 
dogs. Fels & Distl (2014) performed a GWAS on German shepherds comparing 96 controls (dogs graded as nor- 
mal, A or B, at screening) with 96 cases representing HD grade C-E, based on the FCI protocol**. In that study, 
in which a smaller and less dense SNP panel with 127 k SNPs was used, the most associated locus identified in 
the study was found on chromosome 24 with the most significant SNP being located at chr24:25,973,438, in the 
vicinity of the locus identified in our current study. That study validated the risk locus by genotyping the SNP in 
a cohort of 843 German shepherds including 277 non-dysplastic dogs and showed that the risk allele frequency 
was higher in the dysplastic group with an odds ratio of 4.7 (95% CI 3.72-5.94). In a more recent study by Mik- 
kola et al. from 2021, a panel of 46 SNPs including the chr24:25,973,438 SNP were evaluated in 1600 dogs from 
10 different breeds”. The study showed that the risk allele chr24:25,973,438, was associated with an increased 
odds ratio (OR= 1.87, 95% CI 1.07-3.28) of having HD in Labrador retrievers”. Finally, a study using a whole 
genome sequencing GWAS approach with phenotypes based on breed disease frequency rather than individual 
phenotypes, identified a locus overlapping with the current locus on chromosome 24. The locus was shown to be 
associated with anterior cruciate ligament rupture, elbow dysplasia and hip dysplasia in different breeds, includ- 
ing Labrador retrievers'’. Overlap between the loci identified in these studies and the current study has been 
summarized in supplementary Fig. S7. With these previous studies in mind, the identified locus in the current 
study most likely has implications for HD in Labrador retrievers, though further functional and epidemiologi- 
cal validation studies are warranted to confirm the risk association and validate putative functional variants. 

In our study, two loci significantly associated with body weight in female Labrador retrievers were identified. 
Subsequent testing showed that there is a significantly higher body weight in the group of individuals carrying 
either one or two copies of the risk allele in either locus. This was also confirmed when looking at the two risk 
loci combined. The risk locus on chromosome 10 is a narrow locus with the most associated SNP located in the 
PRKCE gene. This gene was identified as a novel locus associated with body weight and BMI in a large-scale 
human study investigating anthropometric traits**. The associated SNPs on chromosome 31 are located within 
the CADM2 gene. A knockout of the CADM2 gene was shown to protect against obesity in mouse models and 
CADM2 has been identified as a risk locus for increased BMI in multiple human studies*”**""’, Further, evalua- 
tion of genotype dependent differential expression of CAMD2 in different brain regions, including hypothalamus, 
has been shown to correlate with genotype in identified human CADM2 risk alleles, validating their effect on 
gene expression’. With the comparative evidence for the CADM2 gene’s role in obesity, this is a strong candidate 
locus for weight regulation in Labrador retrievers, which should be explored further to potentially understand 
the genetic cause of obesity within this dog breed. 

The PVE explained by the data was calculated for all traits in this study. The PVE differed between the inves- 
tigated phenotypes. Though the PVE was within the expected range for the HD and color association, it was 
higher than expected for the weight association, reflecting that body weight is a highly heritable trait for which 
a large proportion of the variation can be explained by genetics. Taking the confidence intervals into account, 
the PVE is not far from the heritability described for body weight within different dog breeds”. As we only had 
body weight and not body condition score, part of the weight phenotype in our study likely reflects dog size 
rather than solely obesity, which can explain the high PVE for this trait. 

There are certain strengths and limitations in this study. We used a conservative threshold for determining 
significance, by using the significance threshold established by Bonferroni correction based on the number of 
independent SNPs in the dataset. Other GWAS studies have used different thresholds such as nominal signifi- 
cance based on deviation of the observed p-value from the expected p-value based on the Q-Q plot’. Here we 
employed a conservative threshold increasing the risk of missing possible loci not meting the conservative cut-off. 
However, the SNPs marking additional putative loci have been summarized in Supplementary file 2. In the vicin- 
ity of these SNPs it is observed that there are many additional candidate genes for future exploration. A strength 
of the current study is the high-quality phenotypic data being registered systematically in the SKK database and 
being collected in a stringent manner. However, the current study population only consisted of 31 dogs with HD 
scores C or worse, and a larger dataset with more severely affected individuals could have increased the power 
of detection. Another limitation is the inclusion of only female dogs in the weight analysis. This was done to 
avoid a strong bias towards the X chromosome, which is commonly found when performing an analysis includ- 
ing individuals of both sexes unless the X chromosome is discarded from the analysis’. We used body weight 
and not body condition score, or another more accurate measure of obesity, in this study and hence we cannot 
exclude that the weight reflects general larger body size and not necessarily obesity. Nevertheless, the associated 
loci identified have previously been associated with body mass and body weight in humans and animals. 

In conclusion, this study identified genetic loci significantly associated with HD and body weight in Swedish 
Labrador retrievers. The identified loci intersected promising candidate genes with known function related to 
the observed phenotypes. Further validation studies should be performed in the future to fully understand the 
functional consequences of the risk SNPs and to identify putative causal variants. 
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Materials and methods 

Sample collection and phenotypes 

Ethylenediaminetetraacetic acid (EDTA) preserved blood samples were collected from Labrador retrievers as 
part of two studies investigating the genetic predisposition to severe uterine infection (pyometra) and atopic 
dermatitis in Swedish Labrador retrievers. As most samples were collected as part of the pyometra study, the 
dataset has an overrepresentation of females. Phenotypic data consisting of color, weight, HD score and individual 
HD estimated breeding value for HD (EBVyp), was extracted from the SKK registry based on the individual 
dog’s registration number. The EBVyp reflects the likelihood of the dog passing on HD to their offspring and are 
expressed in relation to the breed average which is set at 100. Values above 100 reflect better than breed average 
and a decreased propensity to pass on HD to offspring, whilst values below 100 reflect worse than breed average 
and an increased propensity to pass on HD to offspring. HD scores based on the official protocol of the FCI with 
grades A-E were available for 204 dogs. Scores A and B represent normal (non-dysplastic) hip joints, C mild 
HD, D moderate HD and E severe HD. The scores were translated into a numerical value based on the scale 
presented by Malm et al.*°. HD scores were available for 204 dogs based on the current FCI approved system. 
However, for a subset of five individuals the hips were scored based on an older Swedish scoring system which 
was translated into the FCI approved system’***, 


DNA extraction 
Genomic DNA was extracted from EDTA blood using the QIASymphony robot (Qiagen, Hilden, Germany) 
together with the QlAamp DNA Blood Midi Kit (Qiagen, Hilden Germany). 


Genome-wide genotyping 

Genomic DNA from each dog was genotyped using the Illumina 170 K CanineHD BeadChip based on the 
CanFam3.1 (Illumina, San Diego, CA, USA)*’. Genotyping of samples was performed at the Centre National 
de Genotypage (France) and NeoGen Genomics genotyping platform (NeoGen Genomics, Lincoln, NE, USA). 


Data filtering and visualization of population structure 

Data filtering, sex confirmation and visualisation of population structure was performed using the software 
PLINK v1.90 *°. SNPs with a minor allele frequency (MAF) of less than 5% and SNPs which failed to be genotyped 
in more than 5% of samples were removed (-maf 0.05, -geno 0.05). Individuals with more than 5% genotyping 
data missing were removed (-mind 0.05). A sex check was performed to assure that the registered sex overlapped 
with the genotype predicted sex. A MDS plot was generated, and the first two dimensions (C1-C2) were plotted 
against each other to visualize population structure. 


GWAS analysis and visualization 

A basic association test was performed with each phenotype using PLINK v1.90**. To account for cryptic related- 
ness and population structure, a univariate mixed model GWAS analysis was performed using the Genome-wide 
Efficient Mixed Model Association (GEMMA) software*®. The software was used with a centered genotype matrix 
and Wald test statistics for statistical analysis. QQ-plots and Manhattan plots were generated in RStudio 2022.07.2 
using the software package Fastman **°°. The significance threshold for each association was determined using an 
LD corrected Bonferroni significance threshold based on the number of SNPs that were not in near-complete LD 
as determined by evaluating SNPs within 100 kb windows with an R*>0.9 as calculated by PLINK v1.90 (-indep 
100 10 10)°'. The proportion of phenotypic variance explained by the genotyping data (Chip heritability) was 
calculated for each linear mixed model analysis using GEMMA“. 


Additional statistical testing 

Statistical testing was performed evaluating differences between phenotypes grouped by associated risk genotype. 
First, the dependent variables were tested to evaluate if they followed a normal distribution by a D’Agostino-Pear- 
son test. For variables that followed a normal distribution, a one-way ANOVA analysis was performed to evaluate 
if there were significant differences within the dataset. This was followed by a Tukey’s multiple comparison test 
to evaluate the difference between individual genotypes, adjusting p-values for multiple testing. The EBVyp was 
not normally distributed in its native or log transformed format. Hence, differences in EBVyp between genotype 
groups were tested using a non-parametric Kruskal-Wallis test followed by a Dunn’s multiple comparison test, 
adjusting p-values for multiple testing. Linear regression analysis was used to evaluate any correlation between 
age and body weight. Graphical visualisation and analysis of the statistical tests was performed in Prism v. 10.0.2. 


Human GWAS comparison 
The NHGRI-EBI GWAS catalog was used to evaluate if genes in the associated loci had been implicated in human 
studies investigating similar phenotypes”. 


Ethical approval 

All blood samples included in this study were collected with owners informed written consent. Samples were col- 
lected in agreement with relevant guidelines and regulations. Ethical approval was granted by the regional animal 
ethics committee Uppsala ethics committee on animal experiments (Uppsala djurforséksetiska namnd) refer- 
ence numbers: Dnr C12/15, D318/9, C139/9. Methods are reported in accordance with the ARRIVE guidelines. 
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Data availability 
Phenotypic and genotyping data can be downloaded at SciLifeLab https://doi.org/10.17044/scilifelab.25334122. 
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